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Abstract 

PGR (Polymerase Chain Reaction), a method which rephcates a selected sequence of DNA, 
has revolutionized the study of genomic material, but mathematical study of the process has 
been limited to simple deterministic models or descriptions relying on stochastic processes. 
In this paper we develop a suite of deterministic models for the reactions of quantitative PGR 
(Polymerase Ghain Reaction) based on the law of mass action. Maps are created from DNA copy 
number in one cycle to the next, with ordinary differential equations describing the evolution of 
difference molecular species during each cycle. Qualitative analysis is preformed at each stage 
and parameters are estimated by fitting each model to data from Roche LightGycler (TM) 
runs. 



1 Introduction 



The Polymerase Chain Reaction (PGR), a technique for the enzymatic amplification of specific 
target segments of DNA, has revolutionized molecular biological approaches involving genomic 
material. This, in turn, has impacted research in human genetics, disease diagnosis, cancer de- 
tection, evolutionary and developmental biology, and pathogen detection, to name a few. The 
company Idaho Technology, Inc. has capitalized on the invention of fiuorescent probe techniques 
to create fast, accurate devices for quantitative PGR. Quantitative PGR is a method where the 
amount of amplified DNA (or amplicon) is tracked throughout the reaction and the initial amount 
of sample DNA can then be estimated. Understanding the important parts of a complex reaction 
that is repeated tens of times, is critical in improving the design of these processes in the labora- 
tory, and to date theoretical studies of quantitative PGR are limited. In this paper we present a 
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suite of deterministic models for quantitative PGR, with parameters estimated from data provided 
from Roche LightCycler (TM) PGR runs. Determining the critical features of the model through 
construction of increasingly complex descriptions of the reaction is the overall goal of the project. 

In PGR a reaction mixture containing a few copies of the target double-stranded DNA is first 
heated to separate the DNA into single strands. It is then rapidly cooled and held at a lower 
temperature briefly so that PGR primers (short single strands of DNA that have been designed 
for this purpose) anneal specifically to the template DNA. The enzyme Taq Polymerase recognizes 
these primer-template pairs and synthesizes a new strand of DNA, starting at the end of the 
annealed primer. In this way, a complementary strand is made from each strand of the original 
double-stranded DNA molecule. Under ideal reaction conditions the number of copies of this 
stretch of DNA in the sample is doubled in each heating-cooling cycle. 

Instruments that perform real-time PGR usually detect the amplified DNA using fluorescent probes 
that are added to the PGR reagents before temperature cycling. These probes bind to the DNA and 
generally fluoresce more when bound than when free. When there is a sufficient quantity of DNA 
present in the sample (for example, after many temperature cycles), this change in fluorescence is 
detected using a fluorimeter. If the fluorescent signal of a sample rises above a background level, 
a sizable amount of DNA has been synthesized, indicating that the specific DNA was initially 
present. 

Gurrent methods for DNA quantification (for more information see the following references: Mor- 
rison et al. (1998), Wittwer et al. (1994, 1997), Weiss and Von Haeseler (1997), Sun (1995), Sun 
et al. (1996)) with PGR are based on comparing a set of successively diluted standards against 
unknown samples. The methods utilize the concentrations of the standards in a dilution series to 
determine the concentration of the unknown. The amount of DNA in successively diluted stan- 
dards is typically decreased by factors of 2 or 10, and anywhere from three to ten standards are 
used. Figure 1 shows a set of six standards containing between one and 1,000,000 copies of ini- 
tial DNA template. The fiuorescence curve that crosses the threshold value at the smallest cycle 
number initially had 1,000,000 copies of DNA, the next curve to cross the threshold had 100,000 
copies of DNA initially, and so on. Notice that the curve that does not cross the threshold is the 
control-no-templatc sample. Also plotted on this graph are two sets of replicates with unknown 
initial quantities of DNA template. 

A quick estimate of the order of magnitude of the number of copies of DNA initially in the unknown 
samples can be found by simply comparing the amplification curves of the samples and diluted 
standards. Gurrent methods produce more precise estimates using a mathematical model of PGR 
that assumes the product grows exponentially: 

Cn+l = Cn + ECn = (1 + E)Cn = (1 + EYCq. (1) 

In this model, C represents the number of copies of DNA, n represents the cycle number and E 

represents the efficiency of the PGR. E can be thought of as the percentage of existing DNA that 
is replicated in a cycle. Dilution standards have known values for Co, and data from these samples 
can be used to calculate the efficiency E. Given the efficiency, the initial copy number Cq can be 
estimated for each unknown sample. 

This model is accurate for a small number of cycles, but grows less and less accurate as the number 
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cycle number 

Figure 1: Fluorescence level vs. cycle number during PGR Roche Lightcycler run. Different lines 
are standard dilutions for quantification purposes, from 10^ copies down to 10^ and no template 
as a control. Also run simultaneously are two samples of unknown concentration, five replicates 
each. 

of DNA copies grows. Unfortunately, the fluorescence signal can be distinguished from the noise 
only later in the experiment, precisely as the model becomes less accurate. The simplest mistake 
in the model is the assumption that the efficiency does not change with cycle number, and that 
the number of copies of DNA always grows. In reality, PGR products saturate the reaction and 
resources are exhausted, slowing and eventually stopping DNA synthesis. 

The amplification curves thus suggest that a more natural model for the PGR reaction would 
be logistic, which proceeds to saturation as a resource is depleted. Current IT software fits the 
data to such a logistic map, and uses the result to estimate initial copy number of the template. 
For the purpose of this estimation both the exponential growth model and the logistic model are 
sufficient in many cases, and have the advantage of a limited number of free parameters, requiring 
a minimum amount parameter estimation. However, for the long range goal of developing a more 
complete model of the reaction that can lead to innovation in process design, we must look beyond 
these one dimensional approximations. We also see that the data deviate from the logistic model 
in a consistent way for all the amplification curves, suggesting that the simplifications leading to 
it eliminate some critical features of the dynamics. 

To our knowledge no deterministic model of the reactions of PGR that does not include assumptions 
about the kind of enzyme kinetics involved (i.e. Michaelis-Menten) are present in the literature. 
Stochastic models for estimating reaction efficiency and specificity can be found however, for 
instance, in Sun (1995) a model for distributions of mutations and estimation of mutation rates 
during PGR is developed, using the theory of branching processes. Another such model is reported 
in Weiss and Von Haeseler (1995), where the accumulation of new molecules during PGR is treated 
as a randomly bifurcating tree to estimate overall error rates for the reaction. In Schnell and 
Mendoze (1997), the reaction efficiency of quantitative competitive PGR (QG-PGR: a target and 
a competitor template are amplified simultaneously to provide an internal standard for identifying 
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the initial target template amount) is computed using Michaelis-Menten type kinetics. 

Stolvitzky and Cecchi (1996) address the validity of assuming a constant efficiency during a PGR 
reaction by deriving the probability of replication during successive cycles as a function of physical 
parameters. In the same vein, Velikanov and Kapral (1999) report on a probabilistic model of 
the kinetics of PGR using microscopic Markov processes. The result is an exact solution for the 
distribution of lengths of synthesized DNA strands, and an optimization procedure is applied to 
determine control parameters that maximized the yield of the target sequence. Most recently, in 
a 2004 publication Whitney et al. describe a stochastic model for competitive interactions during 
PGR to compute product distributions at the completion of regular PGR. The calculated yield 
is compared to experimental values from the amplification of three different size amplicons, with 
good results. 

In this paper we develop deterministic models based directly on the reaction equations using the 
law of mass action. A hierarchy of models is built by including more biochemistry into each 
successive level of approximation. We analyze qualitatively and numerically the solutions to the 
models under typical operating conditions, and perform parameter estimation with data provided 
by Idaho Technology. Finally the advantages and disadvantages of including more details of the 
reactions into the model are discussed. 
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2 The Reactions of PGR 



Tab 


c I: List of Variables and Notation Used in PGR Models 


Variable name 


quantity 


C 


copy number 


E 


exponential efficiency of reaction 


S, s 


single stranded DNA (ssDNA), s = [S] 


P,p 


primer molecule, p = [P] 


S',s' 


primed ssDNA, s' = [S'] 


Q,q 


Taq molecule, q = [Q] 


C,c 


enzyme complex, c = [C] 


N,n 


nucleotide sequence for the extension, n = [N] 


D,d 


double stranded DNA (dsDNA), d = [D] 


k-i,ki 


forward and backward reaction rates for annealing 




forward and backward reaction rates for complex formation 




forward and backward reaction rates for extension 


e 


logistic map growth parameter 


S 


logistic model growth parameter 


/C 


carrying capacity of the logistic map 


r{di) 


growth parameter function for Taq model 


e, a 


parameters in r(di) 


Wi 


estimation of r{di) from experimental data 


Yi 


logarithmic regression variable 


At 


time step in discrete version of logistic model 


h 


time in stage I of two stage model without Taq dynamics 


h 


time in stage II of two stage model without Taq dynamics 


Tl 


rescaled time in stage I of two stage model without Taq dynamics 


Til 


rescaled time in stage II of two stage model without Taq dynamics 




conserved quantities in the two stage models 


KK 


normalization parameter used for experimental data 


At 


rescaled reaction rates in the two stage model with Taq dynamics 




s' in stage I and stage II respectively 


X 


fixed point of the x variable 


ti,tn 


time in stage I and stage II in model with Taq dynamics 



The PGR reaction proceeds through repeated cycles of dissociation, annealing and extension by 
the enzyme Taq polymerase. During dissociation the sample is heated to approximately 90 degrees 
C where the template's DNA nucleotide base pairs unbind and the strand essentially unzips to 
form two half-strands (single stranded DNA). The sample is then cooled to a temperature where 
the primer reaction is optimal (about 60 degrees G), during which primer molecules, themselves 
sequences of single stranded DNA that have been designed to adhere to either end of the target 
sequence of the template, bind on. Then the sample is heated again to a temperature where 
Taq enzyme adds base pairs on the bracketed sequence to form a new double-stranded piece of 
DNA. The annealing/extension can done in one or two distinct steps, either with a continuous 
ramp-up to the Taq operating temperature (during which time the primers anneal) or with a lower 
temperature annealing stage followed by a higher temperature extension phase. We model the 
latter, but the model itself could easily be adapted for the one-step scenario. 
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These three phases, dissociation, anneahng, and extension, are repeated typically 30-40 times 
to yield exponentially growing numbers of the target sequence, assuming the reaction runs as 
designed. Factors influencing the success of the reaction (is there a product?) are competition from 
contaminants in the reaction mixture, primers that bind to themselves or other primer molecules 
(primer-dimers) , or primers that can extend pieces of the template other than the target, to name 
a few. Naturally the reaction saturates, see fig. ^ which is assumed to occur by complete depletion 
of primer molecules, since they are incorporated into the extended strands. The nucleotides in 
the mixture could also be used up, but typically they are present in great numbers to prevent 
this from occurring. Primers are synthesized molecules and are therefore much more costly than 
nucleotides. Also, the initial amount of DNA to be amplified can not be either too large or too 
small. If it is too large the number of primers is not sufficient to completely prime the molecules, 
and if too small it can lose out to the competing amplification of undesired sequences. 

The reaction equations for these phases can be written as follows. 
Dissociation: 

Annealing: 

fei , 
S + P =t S' 



k- 



Extension: 



S' + Q ^ C 

k-2 



C+N h D+Q 

fc-3 



Here D is double-stranded DNA, S is single-stranded DNA, P is primer, S' , primed single-stranded 
DNA, Q, Taq polymerase, C, complex of primed single-stranded DNA, P', and Taq, N, nucleotides. 
The plus/minus /c's represent the forward and backward reaction rate respectively. Ideally the 
reactions form a cascade, the product of one reaction continues into the next reaction and the 
final double-stranded DNA cycles back to the dissociation phase. In reality the reactions occur 
simultaneously, with highest frequency at their optimal temperature. For our purpose we will 
treat the phases as distinct and cascade the output of one phase to the input of the next. We 
also assume that the back reactions are negligible compared to the forward reactions in all but the 
creation of the enzyme complex, e.g. /c-i = k-s = 0. 

The law of mass action can be invoked to create differential equations for the concentrations of 
the above reactants, and we use lower case letters to indicate these concentrations, e.g. [S] = s, 
[D] = d, etc. We assume that the resource, nucleotides, is present in chunks of appropriate 
sequences of base pairs for the segment of DNA being extended. That is, we will assume that the 
extension happens all at once, not one base pair at a time. For the annealing reaction we have: 

- = -k^sp (2) 
I = -kisp (3) 
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And for the extension phase the equations are: 

^ = -k2s'q + k^2C (5) 

dq , , , 

— = -k2S q + k_2C + kscn (6) 

dc 

— = k2s'q - k^2C - k^cn (7) 

dn . 

- = -k,cn (8) 

dd 

- = kscn. (9) 

The exponential model (pQ) is a first level of approximation to the growth of double-stranded DNA 
created in these reactions, and the next level of simplification is the logistic map. This can be 
arrived at in a straightforward manner from these reaction equations by first ignoring the enzyme 
(Taq) dynamics. Working with the set of equations that result if the Taq dynamics is ignored we 
have: 

Annealing (as above): 



Extension: 



ds 

'dt ' 
dp 
dt ' 
ds[ 
lit 

ds^ 

Itt ~ 

dn 
lit " 

dd 
Itt 



— kisp 
—kisp 
+kisp 

-kss'n (10) 
-kss'n (11) 
kss'n (12) 



If the annealing stage is assumed to achieve 100% priming, the output of the first three equations 
is s'{tend) = s(io)) which in turn becomes the initial condition into the extension phase. Using 
the conserved quantity d + n\n the extension phase, writing d + n = K, = d{Q) + n(0) and setting 
n = IC — d makes the equation for d: 

^ = k;s'{K.-d). 
dt 

Taking an Euler step approximation to this ODE yields: 

Ad = fe3Ats'(/C-(i). 
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If the time step At is taken to be the time for one cycle, the result is a map for amounts of s', d, 
and n from one cycle to the next. For the i-th cycle: 

Ad = dj+i — di = es[{IC — di). 

where e = Atk^. With perfect priming = Sj, the amount of single stranded DNA at the beginning 
of the priming phase, and with perfect dissociation si = 2di, the amount of double stranded DNA 
from the end of last extension phase. The equation for d then becomes 

di+i= di + e2di{}C - di), (13) 

which is a logistic map for di. 

To test the assumption of logistic data, and we fit a dilution series from a Roche LightCycler 
(TM) run to ()13|1. The run was typical for these quantification experiments: it had 45 cycles, 
each consisting of a brief melt stage at 95 degrees C, a 10 second annealing stage at 55 degrees C, 
and a 30 second extension stage at 72 degrees C. The fluorescence acquisition occurred at the end 
of the annealing stage, and used a FRET (fluorescence resonance energy transfer) probe system. 
FRET probes are a pair of oglionucleotides labeled with fluorescent dyes. The pair are designed 
to hybridize to adjacent regions on the target DNA, and the marker dyes of each probe can only 
interact when they are in close proximity and bound to the target. The fluorophores are chosen so 
that the emission spectrum of one overlaps with the excitation spectrum of the other. The donor 
fluorophore is excited by a light source, transfers its energy to an acceptor fluorophore, which then 
emits light of a longer wavelength. This light is then detected during the fluorescence acquisition. 

The parameter estimation was done in MATLAB using least squares to first compute K and 7, 
and a simplex search method based scheme (MATLAB's fminsearch) to find the initial fiuorescence 
level. The objective function used in the nonlinear optimization was the two-norm of the difference 
between the model time series and the data. Percentage error was computed by dividing the final 
value of the objective function by the two-norm of the model time series. 

Table II: Parameter Estimation for the Logistic Model (|13() 
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K 


do 


% error 


run 1 


0.0185 


8.1473 


2.9 X 10"^ 


5.15 


run 2 


0.0189 


8.2832 


1.6 X 10"^ 


5.8 


run 3 


0.0214 


7.9001 


4.2 X 10"^ 


5.15 


run 4 


0.0296 


6.678 


5.5 X 10-^ 


4.9 


run 5 


0.0457 


4.5959 


1.02 X 10"^ 


5.3 


run 6 


0.1273 


1.5445 


4.12 X 10"^ 


5.82 



The results of the parameter estimation for the dilution series are present in figure |2l and Table 
II. We see that the model is more than adequate for predicting initial copy number, given the 
current practice of running standards simultaneously with samples to generate a map between 
initial copy number and fluorescence level. The drift in the growth constant for decreasing copy 
number indicates that some aspect of the dynamics is not captured by this map, and for the 
lowest copy number run (number 6) we see that the initial amount estimated is off by an order 
of magnitude. In this case competition from other reactions is thought to be the culprit, but in 
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all cases the map is a gross simplification to the actual dynamics. It clearly overestimates the 
growth for earlier cycles, and approaches saturation more quickly at later cycles. To verify this 
intuition quantitatively we preformed a version of logarithmic regression on the first five runs of the 
dilutions series (see figure ^ . The basis for this regression is the separation of variables solution 
to the logistic differential equation: 

^ = 6y{K-y), 

namely 

ln{—^) = K5t + \n{- ^° ^ 



K -y K -yo 

If y follows logarithmic growth the variable Y = ln( j^^^ ) will depend linearly on time t. Plots of 
the discretely sampled variable Yi vs. cycle number i for standard data set are shown in figure El 
where it is obvious that they are not well-estimated by a linear function of i. There are several 
straight line regions in these graphs, corresponding to a) low cycle number noise, b) a region where 
the initial exponential growth occurs, and c) a third region where saturation happens, with a slope 
less than that seen in the second region. A linear fit of this data would have an intermediate slope 
causing an overestimate or an underestimate of the data, depending on the cycle number. This 
is apparent in figure EJ where it also is clear that the data approach the saturation level more 
slowly than logistic growth would warrant. One explanation of this is reduced efficiency of Taq 
polymerase when the quantity of molecules to extend becomes very large. This suggests a growth 
parameter T{di) that varies with the amount of amplicon, d, so that 

di+i = di + T{di)di{K - di), 

where T{di) is a decreasing function of di. The shape of this function can be estimated by plotting 
the variable found by solving the above equation for F: 

^jrr di+i — di 

That is, the graph of Wi vs. di will give an idea of T{di). An example is shown in figure 01 The 
function F appears to be inversely proportional to di, so we fit a new map with T{di) = , so 

d,+i = d, + -^^{K-di). (14) 
1 + adi 

The results are shown in figure and in Table III. The error presented there is the mean square 
error. 
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Figure 2: Fitting dilution series data with a logistic map. See text for parameter values. Solid 
line- model, (+++++)- data. 
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Figure 4: Plotting Wi vs. di to estimate T{di) for the first four dilution series runs. 



Table III: Parameter Estimation of Taq Model (|14j) . 





error 


e 


K 


a 


do 


run 1 


0.1894 


0.1481 


8.5579 


0.9148 


4.91 X 10"'^ 


run 2 


0.2055 


0.1473 


8.7559 


0.8907 


7.31 X 10"*^ 


run 3 


0.2587 


0.1227 


8.3295 


0.6096 


1.20 X 10"'^ 


run 4 


0.3639 


0.1237 


6.9411 


0.4158 


1.06 X 10"'^ 


run 5 


0.3945 


0.1592 


4.7559 


0.4111 


7.53 X 10-^ 


run 6 


0.2394 


0.3721 


1.5740 


0.6636 


3.02 X 10"^ 


sample la 


0.2321 


0.1413 


9.3277 


0.5918 


3.98 X 10^« 


sample lb 


0.2377 


0.1546 


9.5827 


0.7381 


1.04 X 10"*^ 


sample Ic 


0.2189 


0.1231 


9.7042 


0.5957 


1.57 X 10"'^ 


sample Id 


0.2582 


0.1202 


9.9448 


0.6035 


1.60 X 10"'^ 


sample le 


0.2610 


0.1196 


10.0509 


0.6391 


1.62 X 10"'' 


sample 2a 


0.4427 


0.1297 


6.5511 


0.3543 


8.28 X 10~*^ 


sample 2b 


0.4203 


0.1222 


6.8775 


0.3390 


9.03 X 10-« 


sample 2c 


0.4328 


0.1214 


6.9499 


0.3581 


9.08 X 10^*^ 


sample 2d 


0.4223 


0.1173 


7.1260 


0.3541 


1.07 X 10"^ 


sample 2e 


0.4270 


0.1267 


6.9205 


0.3514 


6.42 X 10"^ 



The growth coefficient for this model (e), and the value for a are more consistent than for the 
regular logistic model, though the variation is more pronounced in smaller copy number runs, and 
suffers the same overcalculation of the initial fluorescence in run 6. (Also in run 6 the model 
coefficients are significantly different from the other runs). We also fitted the replicates of the 
unknown samples (sample 1 and sample 2) with reasonably consistent results, though it is clear 
there is a trade-off between values such as the growth constant e and the initial fluorescence, 
indicating hidden dependencies in the parameters that cannot be differentiated with this sort of 
data. 
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Figure 5: Standards data fitted with the Taq model H14|) . See Table II for parameter and error 
information. 
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This could be the end of the story, but an empirically determined rate function is not as satisfactory 
as a model that captures the behavior built-up directly from the reaction equations. In the next 
section we construct such a model and parameterize it with the data. 



3 The Two Stage Model 

We consider here two versions of a two stage model, one that includes the dynamics of the Taq 
enzyme, and one that does not. The latter is much simpler and can be solved analytically, so we 
present it first. Both make the assumption of complete dissociation, so that the amount of ssDNA 
entering the annealing phase is equal to twice the amount of dsDNA from the previous extension 
phase, plus whatever ssDNA was leftover from the previous annealing phase. This eliminates 
the need for the equation that describes dissociation (or the "melt") phase of the reaction. We 
next assume that the annealing phase happens distinct from the extension phase and call this 
stage I. The equations for stage I are ©-Q- The extension phase we name stage II, and with 
Taq dynamics the equations are ((5))-®. Without the Taq dynamics the equations are given by 

cnii-jii. 

3.1 Two Stage Model without Taq Dynamics 

The two stages in both versions are linked through their initial conditions, and in both the initial 
amount of primed ssDNA in stage II is equal to the amount created in stage I, while the initial 
amount of nucleotide in stage II is whatever was left over from the previous cycle of stage II. For 
the model without Taq dynamics, upon completion of stage II any unextended primed ssDNA will 
break-up in dissociation, and thus the initial amount of ssDNA in stage I is that plus the amount 
left-over from the previous stage I, plus twice the amount of double stranded DNA created in stage 
II. The initial amount of primer in stage I is also the sum of the dissociated amount from stage 
II and the amount leftover from the previous stage I. The initial amount of double-stranded DNA 
in stage II is assumed to be zero at the start of every cycle. Written as equations these initial 
conditions are: 

Stage I: 

s(0) = s(iend) previous stage I) -|- s'(iend, previous stage II) -|- 2(i(tendj previous stage II); (15) 
p{0) = s'(iend 5 previous stage II) -|- |?(tend) previous stage I); s'(0) = 0.0; (16) 

Stage II: 

s'(0) = s{t^ndi stage I); n(0) = n(ten(i5 previous stage II); d(0) = 0.0. (17) 

There are two conserved quantities in the equations for both stage I and stage II, so the systems 
can be reduced to one equation each and solved analytically. For stage I we use the conserved 
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quantities: p{t) - s{t) = K = p(0) - s(0) and s{t) + s'{t) = Ks = s(0) - s'(0) = s(0), (since 
s'(0) = 0) so that the resulting ODE for p{t) is 

^ = ki{K -p)p, 

with the solution 

p(t) 



l-^exp(-i^M)' 
and the other variables are found from the conserved quantities: 

s{t)=p{t)-K = p{t)-p{0) + s{0), 

s'{t) = Ks- s{t) = s(0) - sit). 

It is expedient to rescale the dependent variables in the original ODEs by the amount of one 
quantity at the beginning of the reaction, and since the nucleotides are present in excess we call 
that A^'o and define new variables: p = -^,3 = = j^^,h = -^,d = We also rescale time 

by define a new time ti = kit., where ki = kiN^. 

The solution is then 

'^''^ ^ l-liexp(-M)' 

s{ii) = pih) -k = p{h) - pio) + s(o), 

s'ih) =ks- s{h) = s{0) - s{h). 
where k = p{0) - s(0), k^ = s(0). 

The quantities in stage II can be computed in an identical manner, and with the choice of conserved 
quantities s'{t) + d{t) = Ka = s'(0) + d{0) = s'(0), n{t) - s'{t) = Kn = n(0) - s'(0) the solution is 

n{t) 



^ ~ Tr(^exp(-/C„/c2t) 
and the other variables are found from the conserved quantities: 

s'{t) = n{t) -Kn = n{t) - n(0) + s'(0), 

d{t) =Kd- s'{t) = s'(0) - (n(t) - n(0) + s'(0)) = n(0) - n{t). 

Rescaling again by the amount of nucleotide at the beginning of the first cycle, A'^o, and defining 
t2 = k2t, (where k2 = k2NQ) results in 

k„. 



1 - Jj5yexp(-J^„t2) 



s'(t2) = h{t2) -kn = h{t2) - n(0) + s'(0) 

d{t2) =kd- s'{t2) = n(0) - n(4) 

where = s'(0), Kn = h{0) — s'(0). The initial conditions can be written in terms of the rescaled 
variables, they are identical in form to H16|) and H17() . with X replacing X, for each variable. From 
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this point forward we rename X = X to simplify the notation, while keeping in mind how the 
rescahng changes the initial conditions. 



A map from one cycle to the next can be constructed from these solutions and the initial conditions 
H16|) and ()17|) . To distinguish between the concentration of primed single stranded DNA (ssDNA') 
in the first and second stage we designate them s'^ and s'^^ . Let the final time for each stage be 
fixed at r/ and r// respectively. The map for stage I is then: 

Pi[Tl) 



l-Sg}exp(-i^.r,)' 
where Ki = pi(0) — Sj(0), and 

Si{Tl) = Pi{Tl) - K = Pi{Tl) - pj(0) + Si(0), 

■sf (t/) = Ks- Si{Tj) = Sj(0) - Sj(r/), 

and for stage II: 

where Kn = ^,(0) — s'/^{0), and 

s'iirii) = ni{Tu) - Kn = ni{Tn) - ?T-i(0), 
di{Tn) = Kd- s'l^Tii) = f^^(0) - rii^Tn)- 

The initial conditions for the ith cycle are: 

Si{0) = Si_i(T/) + sfii(r//) + 2di_i{Tii), 

s'^(O) = 0.0, 
sf{0) = s'/{ri), 
ni(0) = nj_i(T7/), 
di{0) = 0.0. 



While a closed form version of the map can be written down, it is not particularly illuminating, 
simulations must be performed to uncover the behavior of the solutions. To do this parameters 
must be estimated or fit from the data, these are ti,tij, and the initial concentrations of primer 
and ssDNA, pi(0),si(0), relative to the initial concentration of nucleotides, A^o- The model and 
data must both saturate at the same level, and a scaling quantity for the data values was fit for 
each curve, which we called KK. The parameter estimation was done as in the previous section, 
with the same set of data. The percentage error in each fit was computed by dividing the final 
value of the objective function by the two-norm of the model time series vector. 

For the dilution series all these parameters were fit, see figure IHl and Table IVa. The KK value 
found is consistent with the variation of the saturation value for each run, and for each run ri 
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is greater than r2, indicating that anneahng is slower than extension. The initial quantity si(0) 
decreases roughly by a factor of 10 for each run, also consistent with the dilution series. The 
initial amount of primer in each run is close to 1.0, indicating the amount of primer must be 
close to the amount of nucleotide in a run to match the data, which is not consistent with the 
information we have about the experiment, which is that the nucleotides (measured in blocks of 
the template sequence to be amplified) to primer ratio is about 4-to-l. We return to this issue 
when we examine the model including Taq dynamics. The 6th run has a much larger error than 
the others, indicating that other factors come into play in the dynamics of very small copy number 
runs, such as competition from other species (e.g. primer-dimers) . 

We then fit the model to the runs with unknown initial concentrations, of which there are two, 
each with five replicates. For these 10 runs tj,tjj,pi(0) were fixed at values from the dilution run 
that comes out of background at nearly the same time (for the sample 1 this was run 2, for sample 
2, run 4). The results from this exercise are presented in Table IVb. 

Table IVa: Parameter Estimation of the Two Stage Model without Taq Dynamics-Dilution Series 





KK 


ri 


TII 


si(0) 




% error 


run 1 


8.989 


1.525 


0.829 


1.84e-04 


0.9935 


10.1 


run 2 


9.169 


1.551 


0.853 


7.23e-05 


0.9933 


9.8 


run 3 


8.768 


1.597 


0.916 


1.67e-05 


0.9930 


9.4 


run 4 


7.390 


1.466 


1.216 


1.70e-06 


0.9978 


8.0 


run 5 


4.902 


1.454 


1.598 


1.68e-07 


0.9636 


10.4 


run 6 


1.645 


4.787 


0.939 


8.00e-8 


0.991 


19.38 



Table IVb: Parameter Estimation of Two Stage Model w/out Taq Dyn. -Unknown Samples 





^i(O) 


KK 


% error 


sample la 


8.019e-05 


10.35 


14.7 


sample lb 


7.87e-05 


10.482 


13.0 


sample Ic 


7.67e-05 


10.518 


11.2 


sample Id 


7.57e-05 


10.713 


10.9 


sample le 


7.42e-05 


10.722 


10.7 


sample 2a 


1.51e-06 


7.158 


9.4 


sample 2b 


1.40e-06 


7.490 


8.6 


sample 2c 


1.40e-06 


7.518 


8.6 


sample 2d 


1.39e-06 


7.688 


8.5 


sample 2e 


1.50e-06 


7.430 


8.8 



3.2 Two Stage Model with Taq Dynamics 

For this model the equations for stage I remain as ((3)-©, while the stage II equations now include 
both Taq and complex concentrations and are given by (|10|) - (|12|) . As in the simpler model the 
initial amount of primed ssDNA in stage II is equal to the amount created in stage I, while the 
initial amount of nucleotide in stage II is whatever was left over from the previous cycle of stage 
II. Upon completion of stage II any unextended complex will break-up during dissociation, as will 
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cycle number 



Figure 6: Fitting dilution series data to the two stage model without Taq dynamics. For parameter 
values see Table IVa. 

any primed ssDNA. Thus the initial amount of complex in stage II will be zero, and the amount 
of Taq enzyme will be the original amount from the beginning of the PGR reaction (Q). Stage 
I starts with no primed ssDNA, it is assumed to dissociate during the melt phase. The primer 
initial condition is the amount of unused primer from the previous cycle, plus the amount created 
during the dissociation of the complex during the melt phase. The initial amount of ssDNA will 
be that left from the previous annealing phase plus an amount equal to amount of complex left in 
stage II that dissociates, plus the ssDNA that results from the dissociation of the dsDNA created 
in the previous stage II, which is double the amount of dsDNA. In terms of equations these initial 
conditions can be stated: 

s' (0) = s' {tend, Stage I); q{0) = Q; c(0) = 0.0; n(0) = n(ien(i! Previous stage II); d(0) = 0.0. 

(18) 

and for stage I: 

s(0) = s{tend, previous stage I) + c{tend, previous stage II) + 2d{tend, previous stage II); (19) 

p(0) = c{tendi previous stage II) + s'{tend, previous stage II) + p{tend, previous stage I); 

s'{0) = 0.0. 

A map for the reaction is created by integrating the ODEs in each stage and using the initial 
condition rules to link one stage to the other. However, insight can be gained by analyzing the 
dynamics of each stage separately and forming some special limiting cases for this map. 
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3.2.1 Dynamics of Stage I 



The equations for stage I are again completely integrable, because of two conserved quantities, 
s + s' = Kg = s'(0) + s(0) = s(0), and p — s = K = p(0) — s(0). Simplifying the equation for p 
using these quantities yields 

^ = k,iK-p)p. (20) 

Rescaling time, i = p{0)kit, ^ = and the molecular concentration of all three quantities by 
p{0), i.e. p = s = ^,s' = and k = ^^^^gj^ = 1 - s{0), results in 

p= [k - p)p, 

which has the solution 

p{t) = ^, (21 

l-s(0)e-^* ^ ^ 

so the remaining quantities can be computed 

s{i) = p{i) - k = m + m - h 

and 

s'{i) = ks-s{i) = '^-pii)- 

Note that in the limit as t — > oo, p{t) k k > 0, e.g. 1 > s(0), more initial primer than 
ssDNA, and p{t) — > if -R' < 0, i.e., there is more ssDNA to begin with than primer. There is a 
transcritical bifurcation at = 0, s(0) = 1, where the two fixed points for the system (|2U() . p = K 
and p = 0, exchange stability. 



3.2.2 Dynamics of Stage II 

The stage II ODEs can be simplified by rescaling time to remove one rate constant. The dimen- 
sionless time chosen, r, is k^2't and the new system is 

s' = --fs'q + c (22) 

q = -^s'q + C + Pen (23) 

c = 7s'g — c — I3cn (24) 

h = -Pen (25) 

d = Pen (26) 



wheref = /,/3=^,7=e- 

The initial conditions are (as stated previously): 

s'(0) = s'(tend5 stage I); q{0) = Q; c(0) = 0.0; n(0) = n(te„£^, previous stage II), d{0)=0.0. 
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The molecular quantities in the above equations can themselves be scaled, we choose here to scale 
by the initial amount of nucleotides available at the start of the cycle, Nq = n(0). The rescaled 
equations have the same form, with rescaled parameters 7 = cind /3 = P/Nq (meaning we 

adopt the notation X = ^ for each quantity, and then discard the hat for simplicity). The 
corresponding initial conditions are 

s'{0) = ^'(^^"^' ^*^g^ ^) . q{0) = -9-. c(0)=0.0; n(0) = 1, (i(0) = 0.0. 



The equations (P^ - H26() . have three conserved quantities (e.g. n+d, q+c, s+c+d), so the dynamics 
can be reduced from a five to a two-dimensional system. Since initially there will be no c or d, the 
these conserved quantities can be written s' + c + d = s'(0), q + c = q{0), n + d = 1. The two 
dimensional system that results from incorporating the conserved quantities is: 

q = -7(s'(0) - q{0) - I + q + n)q + q{0) -q + /3{q{0) - q)n 

h = —P{q{0) — q)n. 

From this we can more readily determine fixed points and analyze their stability. Two of the three 
fixed points for this system are physically relevant, and are given by 

f.p. 1 = (s' = 0, q = q{0), c = 0, n = 1 - s'(0), d=s'{0)), 

and 

f.p. 2={q= ^(7(g(0) + 1 - s'm - 1 + v/(-7(g(0) + 1 - s'{0)) + 1)2 + 475(0))), 
27 

s' = s'{0) -q{0) -1 + q, c = q{0) - q, n = 0, d=l). 
The third fixed point has the q-coordinate: 

Q = ^iliQiO) + 1 - s'(0)) - 1 - Vi-liliO) + 1 - ^'(0)) + 1)2 + 47^(0))) 
27 

which can be shown to be always negative, and so physically irrelevant. For a proof of this fact, 
and the complete stability analysis, see appendix I. Here we state the main result only: There is a 
transcritical bifurcation when s'{0) = 1 = n(0), here f.p. 1 and f.p. 2 are identical and exchange 
stability. For n(0) > s'{0) f.p. 1 is attracting and has non-negative coordinate values. For 
n(0) > s'(0) f.p. 2 is attracting with non-negative coordinate values. The reaction will typically 
begin with more nucleotides than primed ssDNA, so the long-time behavior is represented by f.p. 
1, which has final dsDNA value equal to the amount of primed ssDNA (all primed ssDNA is 
extended). If the initial amount of primed ssDNA exceeds the amount nucleotide at the start of 
that cycle, the second fixed point becomes attracting and non-negative, and in this case the long 
term dsDNA amount will be equal to the initial amount of nucleotide. Hence the limiting value 
of dsDNA switches from the initial amount of primed ssDNA to the initial amount of nucleotide 
as primer becomes limiting, as would be anticipated. 
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3.3 Limiting Cases 



Assuming both stage I and stage II ODEs reach steady state, a map can be constructed by hnking 
the attracting fixed points through the initial conditions. Since the rescahng of each stage used in 
the previous subsection is different, (and changes with each cycle) it is best to return to the unsealed 
quantities to create the map. The fixed points for stage II in terms of the unsealed quantities are 
the same, with the 1.0's replaced by n(0)'s. The bifurcation occurs when s'(0) = n(0), as was 
outlined previously. 

The attracting fixed point for stage I depends on the initial values of primer and ssDNA, we will 
write it (p, ,s, s^^ = p(0) — s(0), 0, s(0)) if p(0) > s(0) (there is enough primer to prime all the 
ssDNA). If p(0) < s(0) it is {p, s, s'^^ = 0, p{0) — s(0), p{0)): the limiting amount is primer. The 
attracting fixed point for stage II depends on the relative size of n(0) and s'(0), the latter being 
equal to s' from the previous stage I. The initial amount of nucleotide will be determined by how 
much remains from the previous cycle. In a similar manner the rest of the initial conditions are 
determined by the fixed points from the previous cycle, this dependance is detailed below. 

For stage I the initial amount of primer is the sum of what is left over from the previoTis stage I 
(cycle i — 1), and the primer released from the dissociation of the complex and the unextended 
primed ssDNA from the previous stage II: 

Pi{0) =pi-i + Ci-i + s'lli- 

The initial amount of ssDNA will be twice the amount of dsDNA created in the previous cycle, 
plus the ssDNA released from the dissociation of the complex and the unextended primed ssDNA 
from the previous stage II, plus the amount of ssDNA not primed in the previous stage I: 

Si{0) = 2di-i + Ci-i + s'll^ + Si-i. 

At the start of stage I there is no primed ssDNA: s^^(O) = 0. 

In stage II the initial amount of primed ssDNA is equal to the amount coming out of stage I, s'l , 
the initial amount of Taq is a constant, Q, the complex has been completely dissociated during 
the melt phase, along with any dsDNA. The amount of nucleotide is equal to the amount left over 
from the previous stage II, so the ICs are: 

s'^^(0) = sf; gi(0) = Q; Ci(0) = 0.0; ni(0) = n^-i; di(0) = 0.0. 

There will be four distinct cases of the map depending on the relative size of n(0),s'-f^(0) and 
]3(0),s(0). Here we limit the analysis to a physically realistic scenario: initially both primer and 
nucleotide dominate, but initial primer amount is less than initial nucleotide amount. In the course 
of creating new amplicon both primer and nucleotide amounts decrease, and since they are used in 
the same proportion the primers will be exhausted before the nucleotides. This will cause a shift 
to another case of the map, and the remaining two iterations will complete the process, since no 
more primer will be available to make the extension possible. The equations for this scenario are 
presented next. 

First assume Pi(0) > Si(0) and nj(0) > s — s'^^(O), so the stage I fixed point is (for the zth cycle) 

pi=p{Q)i-s{Q)i- Si = 0; sf = Si(0). 

20 



Then assume ni{0) > s'f {0) so that the fixed point for stage II is 

s'/^ = Ci = 0, qi = qiO). 

Hi = miO) - s'/\0), 
di = s'l\Q). 

Now, the initial conditions for cycle i are determined by the fixed points from cycle i — 1 in this 
manner: 

Pi(0) = Pi-i + Ci-i + = pi-i 
Si(0) = 2(ij_i + Ci-i + s'lli + Si-i = 2dj_i. 
Substituting these into the fixed point for stage I yields: 

Pi = pi-i - 2(ii_i; Si = 0; s'^ = 2di_i. 

The initial conditions for stage II in terms of fixed points for the previous cycle are 

ni{0) = ni-v, sf{0) = s'/ = 2di-i. 

And so the stage II fixed point is then 

sf = Ci = 0, qi = Q 

Hi = fii-i - 2(Jj_i 
di = 2di-\ 

This is simple doubling of the double-stranded DNA, and will proceed until the amount of primer 

(j9(0)) at the beginning of stage I is less than the amount of single-stranded DNA (•s(0)). At this 
cycle (call it A^) the attracting fixed point switches in stage I and a new map is created that is 
valid for exactly one cycle. The new stage I fixed point is: 

PAT = 

SN = SAr(O) -pAr(O) = 2dN-\ - Pn-i 
s'n =Pn{0) =pn-i 

In stage II the amount of nucleotide is depleted by an amount equal to the amount of s'^ created, 
which equals p v-i. The amount of double stranded DNA will be equal to that amount as well, so 
the map for stage II is: 

s'n = cat = 0, QN = g(0) 
HN = UN-l - PN-I 
dN = PN~1 

For the next cycle, + 1, there is no primer left so the duplication ends. The fixed point values 
in this cycle are: 

PN+i = 

SN+i = 2dN + sn = 2pN-i + 2dN-i - Pn-i = Pn-i + SJat-i = Sfinal 
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For stage II: 

flN+l = flN - PN = flN-1 - pN-1 = "final 

dN+1 = 

It is straight-forward to show that this is a fixed point for the map from one cycle to the next, 
once this stage is reached the final value of extended DNA is fixed. It lives in the reaction as single 
stranded DNA until the mixture is allowed to "finish off' and the strands reanneal (finishing off 
more technically refers to the stage in which the extension is allowed to run to completion, and all 
the primed single-stranded DNA molecules have been turned into double stranded DNA, which 
happens in cycle N). 

This map, created for the limiting case of infinite time for each stage in each cycle, converges on 
the model of simple doubling until the reaction limiting species (either p or n) is exhausted. Then 
in one cycle the reaction finishes off and reaches a fixed point. Clearly this does not capture the 
sigmoidal growth curve or the variation away from it. The next step is to allow the extension in 
stage II to reach the asymptotic fixed point determined by initial conditions for each cycle, but to 
use the exact solution of the stage I equations, with the run time left as a parameter, to determine 
the values at the end of stage I. We construct this map next. 

The exact solution for the stage I variables is as follows: 

P'^'^'^ = sm -KT = fTApm,m), (27) 

where K = Pi{0) — Si(0). The unprimed and primed ssDNA depend on p through the conserved 
quantities: 

Si{Ti)=pi{Ti) + Si{0)-pi{0) (28) 

sf{Ti)=piiO)-pi{Ti). (29) 
The stage II fixed point for n(0) > s'^^(O) > will be: 

s'/i = ci = 0, qi = Q 

= n,(0) - sf^(O) = riiiO) - s'/ (Ti) 
di = s"'iO) = s'/iTi)=pi{Q)-pi{Ti). 
The initial conditions for the next stage I are then 

Pi+i{0) = piiTi) +CH + sf = Pi{Ti), (30) 



Si+i(0) = 2di + Ci + s'/' + Si{Tj) = 2{pi{0)-pi{Ti))+pi{Tj) + Si{0)-pi{0)=Pi{0)-pi{Tj) + Si{0), 

(31) 

and 

Si+M=0. (32) 

Note that at this point the stage I initial conditions, in which the map is cast, depend only on the 
previous stage I values, the stage I map runs independently of stage II. The stage II fixed point 
can be determined directly from the stage I variables, and the only one of interest is the amount 
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of nucleotide, for when that is exhausted the reaction will stop. The equation for nucleotide, n, is: 
ni+i(0) = n, = miO) - s'/{Tj) = ni{0) - {pi{0) - Pi{Ti)). (33) 

There are two possibilities for the completion of the reaction: either primer runs out or the 
nucleotides. In the case that primer runs out first, we look at the limit as Pi{0) 0, so that 
Pi(Ti) — > and Si{Tj) — > Si(0) and s[(Tj) 0. In stage II no complex will be formed, or 
double-stranded DNA created, as there is no primed ssDNA available at the start of the reaction: 
^'i^i^) — ^iC^i) = 0- The fixed point for the nucleotide is thus the value of the nucleotide at 
the beginning of the cycle, fii = ni(0) — s'/{Tj) = nj(0), and the map for is at a fixed point, 
ni+i(0) = ni(0). 

If the resource is the limiting factor, rather than primer, in the second to last cycle ((r — l)-cycle) 
we have nr_i(0) < s[!£^{0). This sends stage II variables to f.p. 2. In the final stage I the initial 
conditions are then 

Pr(0) = Pr-l(Tj) + Cr-1 + s'^^li 
SriO) = 2dr~l + Cr-l + s'^li + Sr-l{Ti), 

with f.p. 2 values. At the end of stage I the function values are: 

Pr{Tl) = hj-, Sr(Tj) = Jt, + Sr-(O) - p.(0); sUTi) = Pr-(O) " /t, ■ 

The initial condition for s'^\Q) = s'^{Ti) = Pr(0) — /tj, but only complex can be formed in the 
final stage II, since the resources have been exhausted, 77.^(0) = 0. During the dissociation phase 
the complex breaks up, and the initial quantities for stage I are pr+i(0) = Pr (0); s^._|_i(0) = 5^(0). 
The reaction has reached a fixed point with {p = pr(0), s = 5^(0), n = 0). 

To illustrate the dynamics of this map we plot nj, Si and pi in figure [3 Six runs were performed 
with varying stage I integration time: (i/ = 0.5, 0.75, 1.0, 1.5, 2.0). The initial conditions are ng = 
1.0; po = 0.25; So = 0.001, i.e., the case in which primer is limiting. To further analyze the eff'ect 
of varying integration time we plot the logarithmic regression of s, the quantity Yi = Ind ^^^^ |), 
in figure ISl Here we see that in the limit of shorter integration times the growth is more nearly 
logistic, and for longer integration times it deviates from logistic by being concave up, rather than 
concave down, which is what is seen in the amplification data (figure IHl). Including variation in 
stage I integration time is clearly not enough to capture the correct non-sigmoidal behavior of the 
growth curves, some other part of the reaction dynamics must explain the accentuated slowing of 
growth during the latter half of the reaction. This leads us to integrating and fitting parameters 
on the full model, eq.s (HI)-©, and Q-®. 



3.4 Parameterizing the Full Two Stage Model with PGR Data 

We now investigate the parameterizations of the model with arbitrary time in stage I and in 
stage II. The reactions in the annealing phase (stage I) are the same those presented in equations 
()20|) . (|21() . and the linking initial conditions are H19|). The stage II ODEs are given in equations 
(|22|) -(|2(i| ) with rescaled parameters and initial conditions: 

g^(0) = ^'ftend, stage I) ^ ^(0) = ^; c(0) = 0.0; n(0) = 1.0, d(0) = 0.0. 
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Figure 7: Integration of map with varying stage I integration time. Arrows point in the direction 
of decreasing stage I integration time. 




cycle number i 

Figure 8: Logarithmic regression variable Yn from the two-stage map with varying stage I integra- 
tion time. 
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Figure 9: Integration of the coupled ODE model, with varying initial template amount, so(0). 

Integrating the complete model occurs in phases, first computing the solution of stage I, eq. (|21|) . 
then the value of the stage I variables at T/ are used as initial conditions eq. ()18() for the stage II 
integration, eqs. ©-(HI). The following dissociation phase breaks up existing complex into ssDNA, 
primer and Taq, and dsDNA into twice as many ssDNA strands, and these are used as initial 
conditions for the next stage I (see eq. (|19j) ). Examples of runs with varying amounts of initial 
template are shown in figure El 

The parameter estimation was done using the same quantitative PGR data set in the previous 
sections, again using the Matlab function fniinsearch to minimize the mean square error between 
the amplification data and the simulated time series. In performing the parameterizations we used 
the value determined for the initial s level from fitting the two stage model without Taq dynamics. 
We fit the normalization constant for the data, KK, the two reaction coefficients, /3 and 7, and 
the reaction times tj and tjj. That leaves the initial amount of primer, pi{0), and of taq, Q, 
relative to the initial amount of nucleotides. From the results of many parameterization runs we 
determined that the best fit was obtained when pi(0) = 1.0, which is not what is indicated by IT 
protocol, where a standard reaction set-up has 0.5 micromole of each primer and 0.8 millimole of 
dNTPs, the base pairs (BP) used in extension. Given an amplicon of 200 BPs this means about 
4 micromole of completed segments, or 2 micromole of each complementary segment. The ratio 
of primers to nucleotides is about four to one, then, so we should set the initial primer amount 
to 0.25. This never achieved the same goodness-of-fit that the runs with higher initial amounts 
of primer did. On the other hand, the parameterization was relatively insensitive to the initial 
amount of Taq polymerase, which we set at 1.0. See Tables Va,b for the parameterization results. 

In figure a) we show a comparison of data to model with parameters found by the algorithm, 
for the dilution series. The logarithmic regression variable, Yi = log( j^^'_y ) was plotted for model 
and the same data in figure ^1 b) . 
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Figure 10: Comparison of two stage model with Taq dynamics to data, a) amplification curves, 
b) logarithmic regression curves. The data is represented with dashed lines. See the text for 
parameter values. 
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Table Va: Parameter Estimation: Two Stage Model with Taq Dynamics-Dilution Series 





KK 


si{0) 


P 


7 


ti 


tii 


% error 


run 1 


9.126 


1.8e-04 


4.0267 


0.8650 


2.2137 


1.1521 


2.57 


run 2 


9.3252 


7.23e-05 


5.4532 


1.0404 


2 Mm 


0.9242 


2.62 


run 3 


8.9122 


1.67e-05 


4.8744 


0.9028 


2.2876 


1.1484 


2.66 


run 4 


7.5077 


1.70e-06 


4.6310 


0.9777 


2.5459 


1.2457 


2.43 


run 5 


5.0893 


1.68e-07 


4.3792 


0.9423 


2.0108 


1.6888 


3.05 


run 6 


1.6993 


8.00e-08 


5.9758 


0.6248 


1.9002 


2.3773 


5.52 



We then performed the parameter estimation with data from the replicates with unknown initial 
concentrations, the results are presented in Table Vb. 

Table Vb: Parameter Estimation: Two Stage Model with Taq Dynamics-Unknown Samples 





KK 


si(0) 




7 


ti 


tii 


% error 


sample la 


10.13 


8.02e-05 


0.8133 


1.5316 


2.5651 


1.8070 


3.42 


sample lb 


10.49 


7.87e-05 


1.7851 


0.7927 


3.3509 


1.6211 


3.32 


sample Ic 


10.42 


7.67e-05 


3.1262 


2.2279 


4.3212 


0.6089 


2.86 


sample Id 


10.73 


7.57e-05 


4.5382 


1.3121 


4.2464 


0.7309 


2.86 


sample le 


10.82 


7.42e-05 


5.4529 


1.3825 


3.1713 


0.6814 


2.90 


sample 2a 


7.18 


1.51e-06 


7.6983 


1.2537 


4.7358 


0.7902 


2.76 


sample 2b 


7.54 


1.40e-06 


4.6407 


0.7659 


4.2093 


1.3862 


2.60 


sample 2c 


7.60 


1.40e-06 


5.4337 


0.9113 


3.6034 


1.1658 


2.67 


sample 2d 


7.78 


1.39e-06 


6.7849 


0.8755 


3.4071 


1.1475 


2.64 


sample 2e 


7.50 


1.50e-06 


4.7158 


0.7127 


3.6256 


1.4914 


2.73 



Comparing the fitted parameters for the different replicates in the unknown sample runs points 
out an obvious flaw with this parameterization. There is clearly more than one parameter set 
with an equally good fit to the data, meaning hidden dependencies in the parameters that simple 
rescaling cannot uncover. Determining these dependencies through alternative rescalings and 
singular perturbation analysis, and through other parameter estimation techniques is the subject 
of ongoing research. 

Turning to the problem of realistic initial primer concentration, we found that for values much 
less than 1.0 the model did not capture the non-logistic behavior of the data. This can be seen 
in figure ^Jb), where the curves have a concave down portion near the end of the run, indicating 
the slowing of growth of the amplicon. In figure ^2 we illustrate this by graphing the logarith- 
mic regression variable for differing values of initial primer, with all other parameters fixed. In 
performing parameter estimation, and taking lower initial primer concentration, we found that we 
could not overcome the logistic-type behavior by varying other reaction parameters. A heuristic 
explanation for this is that the nucleotides must be running out as well as primer in this model 
to get the slower than logistic growth at the end of the run. The equation for h (from the two 
dimension reduction) is 

n = -Piqio) - q)n, 

so the rate of loss of n is proportional to n. The amount of dsDNA created is equal to n(0) — n(t) at 
the end of the cycle, so slower loss of n means slower growth of d and hence s. If the initial primer 
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Figure 11: Integration of the two stage model with Taq Dynamics, with varying initial primer 
amount, po{0). 

concentration is not close to that of the initial nucleotide concentration the quantity n (which is 
depleted necessarily at the same rate as p) will not become small, and the rate of creation of s will 
not slow accordingly. 

4 Discussion and Conclusions 

In this paper we have analyzed a sequence of models for the reactions of PGR. Exponential growth, 
the first order approximation of simple doubling of the DNA strands, is replaced by a logistic 
model which captures the sigmoidal nature of the amplification curves. Both these models are 
in common use in current devices. We postulated a variation on the logistic model where the 
efficiency decreases in time, and were able to fit the data with good results. This, however, is less 
satisfactory than a model built directly from the reactions that captures the decrease in efficiency 
as cycle number increases. We then built two such models, one that does not include the enzyme 
Taq directly, and a second which does. 

The model that did not include Taq dynamics is solvable analytically, at least for each stage. A 
map is created by linking the closed form solutions through their initial conditions. It was found 
that the data could be well estimated by reasonable parameter values if the initial amount of primer 
was taken close to that of the initial amount of nucleotide. This is not the protocol followed in 
the experiment, however, where for a 200 BP sequence the primer: nucleotide ratio is about 1:4. It 
appears that the amount of nucleotides in the reaction must decrease significantly by the end of the 
reaction in order to obtain a decrease in overall reaction efficiency. For this to happen the initial 
concentration of primer must about that of the nucleotide. From these observations we concluded 
that the model without Taq dynamics did not capture the full behavior of the amplification curves. 

The second model is built on reactions that include the formation of a complex of primed ssDNA 
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and the Taq enzyme. The equations of stage I could still be solved analytically, but this was not 
possible for stage II. Instead limiting cases of long integration time in stage I or stage II or both 
were considered, and by analyzing the amplification curves created by these maps we concluded 
that none of the limiting cases, including variable annealing and long time extension, would create 
the desired behavior at the end of the reaction. If the extension stage of the reaction was very fast 
compared to the annealing phase, you might expect to capture the qualitative behavior with this 
last case. 

As none of the limiting cases created the qualitative end-of-reaction behavior we thought the 
data demonstrate so clearly, we proceeded to parameterize the full equations for the two stage 
model with Taq dynamics. With a map created from solutions to these we were able to find 
parameters that captured the decreased efficiency at the end of the reaction, but only if initial 
primer concentration was again roughly the same size as the initial concentration of nucleotides. 
The extension phase of the reaction would need to slow accordingly to fit this aspect of the behavior. 
Also, multiple sets of parameters were found to fit the same amplification run, indicating hidden 
dependencies in the parameters that simple rescaling does not uncover. 

While we were not able to completely explain the reduced efficiency seen in the data with our suite 
of models, we were able to determine what portions of the model were important in capturing its 
non-logistic character. Competing reactions at higher cycle numbers most certainly will have an 
effect on the efficiency, especially with the lower initial copy number runs. Future work will include 
analyzing these dependencies both numerically and analytically, and using the two stage model to 
seek reaction protocols that minimize time to almost complete creation of amplicon, and maximize 
yield for a fixed total cycle time. 
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Appendix: Linear stability analysis for stage II with Taq Dynamics 

This stability analysis uses the two dimensional system that takes advantage of the conserved 
quantities, s' + c + d = s'(0), q + c = q'(O), and n + d = 1. In the following 7, (5 and all the initial 
quantities are positive. 

q = -7(s'(0) - g(0) -l + q + n)q + g(0) - q + /3(g(0) - q)n 

h = -/3(g(0) - q)n. 
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The fixed points 



Solving q = and n = simultaneously gives three fixed points. 

f.p.l = iq = q{0),n=l-s'{0)) 

f.p.2 = {q = ^ij{q{0) + 1 - s'(0)) - 1 + V(-7(«(0) + 1 - s'{0)) + 1)2 + 47g(0)), 

n = 0) 

The third fixed point has 

Q = ^ilim + 1 - s'm - 1 - V(-7(9(0) + 1 - s'(0)) + 1)2 + 47g(0)), 
which is always negative. Proof: 

^is aroot of the polynomial 7^2 + (-7(qr(0) + l-s'(0)) + l)g-g(0). Let b = -7(9(0) + l-s'(0)) + l. 
Then 

^/WT^^ >b, 

therefore, 

-b - V62 + 4^g(o) < 0. 
Since f.p.3 is not of physical importance, we will only analyze the stability of f.p. 1 and f.p. 2. 

Stability of f.p.l 

The Jacobian for the reduced system is 

" _7g,_7 (s'(o) -g(0) -l + q + n)-l-Pn q + /3 {q{0) - q) ' 

Let s'(0),g(0) > 0, c(0) = 0,n(0) = 1, and d(0) = 0. Also let 7,/3 > 0. For f.p. 1, the Jacobian 
becomes 

- _^^(o)_i_/3 (i_s'(0)) -79(0) " 
P{l-s'{0)) 

The eigenvalues are, 

l/2(-79(0) - 1 - ^(1 - s'm ± V(79(0) + 1 + /3(1 - s'(0)))2 - 479(0)/3(l - s'(0))). 

The location of the first fixed point and the sign of the nonzero eigenvalues depend on the rela- 
tionship between n(0) = 1 and s'(0). 

Casel: 1 > s'(0) =^ 1 — s'(0) > 0. Both coordinates of the first fixed point are non-negative. The 
eigenvalues are real and negative. They are real because the discriminant is greater than zero. 
Proof: 

(79(0) + 1 + /3(1 - s'mf - 47g(0)/5(l - s'{0)) 
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= 7'g(0)' + 27^(0) + 27g(0)/3(l - s\0)) + 2/3(1 - s'(0)) + 1 + /^^(l - ^'(O))^ - 4jq{0)P{l - s'{0)) 
= 7^g(0)^ - 2jq{0)P{l - s'{0)) + ^2(1 - s'(0))2 + 27g(0) + 2/3(1 - s'(0)) + 1 
= (79(0) - /3(1 - s'mf + 27^(0) + 2/3(1 - s\0)) + 1 > 0. 
They are both negative. Proof: 

Let b = 7^(0) + l + s'(0)). Then 6^ > 47g(0)/3(l - s'(0)) as shown above. 

y/b^ - 47g(0)/3(l - s'iO)) < b, 

therefore, 

-6+ Vb2-47g(0)/3(l-s^(0)) 
2 

and 

-b - V62 - 47g(0)/3(l - s'jO)) 
2 

are both < 0. 

Case 2: n(0) = 1 = s'(0) ^ 1 - s'(0) = 0. The first fixed point becomes (g(0),0) which still is 
nonnegative. The eigenvalues become: 

i(-7g(0) - 1 + V(79(0) + 1)2) = ^(-7?(0) - 1 + 79(0) + 1) = 

and 

^(-79(0) - 1 - V(7g(0) + 1)2) = ^(-27^(0) - 2) = -79(0) - 1. 
For this case, there is only one nonzero eigenvalue, — 7(?(0) — 1, which is negative. 

Case 3: n(0) = 1 < s'{0) ^ 1 — s'(0) < 0. The coordinate, n = 1 — s'(0), of the first fixed point is 
now negative. The eigenvalues, 

l/2(-7^(0) - 1 - /3(1 - s'm ± V(79(0) + 1 + /3(1 - s'(0)))2 - A^q{0)P{l - ^'(0))), 

are real. Proof: 

Let a > and let 1 — s'(0) = —a. Let 

b = 7^(0) + 1 + l3{-a) = 7^(0) + 1-Pa. 
b^ - 475(0)/3(-a) = h^ + 47^(0)/3a, 

therefore 

b^ + 47g(0)/3a > 0. 
One of these eigenvalues is positive and the other is negative. Proof: 

6^ + 47g(0)/3a > 0, so ^Jb'^ + A'^q{Q)l3a > b. Then 

-b+^/W+^^^^ 
2 

and 

-b-^b^+4.-iq{[))(5a 
2 
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stability of f.p.2 

For f.p.2, the Jacobian becomes 



-79-7 (s'(0)-g(0)-l + g)-l -79 + /? (9(0)-g) 
-/? (q(0) - q) 

where q = ^(7(g(0) + 1 - s'(0)) - 1 + a/(-7(q(0) + 1 - s'(0)) + 1)2 + 47^(0)). The eigenvalues are 

^^(-7^(0) + 7(1 - s'(0)) - 1 + V(7g(0) - 7(1 - ^'(0)) + I? + 472(Z(0)(1 - .'(0)), 
27 

- V(79(0) - 7(1 - ^'(0)) + 1)2 + 472g(0)(l - s'(0)). 
Casel: 1 — s'(0) > For this case q is positive. Proof: 



Let h = -7(g(0) + 1 - s'(0)) + 1. 6^ + A-fq{<d) > 0, so yPTi^MO) > b. Therefore, 

-6+ v/62 + 47g( 0)) 
27 

The eigenvalues are both real. Proof: 



> 0. 



(7g(0) - 7(1 - s'm + 1)2 > 0. 

Also, for 1 - s'(0) > 0, 472^(0) (1 - s'(0)) > 0. Therefore, the discriminant is positive. The 
eigenvalue 

- V(79(0) - 7(1 - s'{0)) + 1)2 + 472g(0)(l - s'{0)) 
is clearly negative. The other eigenvalue is positive for this case. Proof: 

Let b = 7^(0) - 7(1 - s'(0)) + 1. 

6< V62 + 4^2g(-0)(l-s'(0)) 

therefore, 

-b + ^/b'^+4j^q{0){l-s'{Q)) > 

Case 2: 1 — s'(0) = 0. For this case, f.p.l = f-P-2. Proof: 



Q = Tri^Qio) - 1 + V(-79(0) + 1)2 + 479(0)) 
27 



^{7q{0) - 1 + ^-7^0)2- 279(0) + 1 + 479(0)) 
27 

= :^(79(0) - 1 + 7-729(0)2 + 279(0) + 1 
27 

= ^(79(0) - 1 + V(79(0) + 1)2 
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= ^(79(0) - 1 + 79(0) + 1) = ^(279(0)) 
= QiO)- 

f.p.2 becomes q = {q{0),n = 0) which equals f.p.l. 

Case 3: 1 — s'(0) < 0. The eigenvalues remain real for this case, but the one that was positive in 
Case 1 becomes negative. Proof: 

Let -a = 1 - s'(0), a > 0, and let b = jq{0) +ja + l. Then 

b > - 4j^q{0)a 

therefore, 

-b + - 472g(0)a < 

Summciry 

For Case 1, f.p.l is a sink, but becomes a saddle as s'(0) becomes larger than n(0)(for Case 3). The 
opposite occurs for f.p.2. It starts out as a saddle (for Case 1) and becomes a sink (for Case 3). 
This indicates a transcritical bifurcation when the initial amount of nucleotides equals the initial 
amount of primed single-strand DNA. 
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